Phoneme classification and lattice rescoring based on a k-NN approach

Authors

  • Ladan Golipour
  • Douglas D. O'Shaughnessy
Abstract

In this paper, we propose a k-NN/SASH phoneme classification algorithm that competes favourably with state-of-the-art methods. We apply a similarity search algorithm (SASH) that has been used successfully for the classification of high-dimensional texts and images. Unlike other search algorithms, the computational time of SASH is not affected by the dimensionality of the data. We therefore generate fixed-length but high-dimensional feature vectors for phonemes from their underlying frames and the frames at their boundaries. The k-NN/SASH phoneme classifier is fast and efficient, and achieves a classification rate of 79.2% on the TIMIT test database. Finally, we apply this algorithm to rescore phoneme lattices generated by a GMM-HMM monophone recognizer for both context-independent and context-dependent tasks. In both cases, the k-NN/SASH classifier improves the recognition rate.
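
The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact feature layout, the SASH implementation, or the rescoring formula, so everything below is an assumption made for illustration: the segment is summarised by averaging a hypothetical three regions plus the boundary frames, an exact brute-force neighbour search stands in for the approximate SASH search, and rescoring is shown as a simple weighted combination of recognizer log-scores and k-NN vote fractions for the competing labels of a single segment. The names `segment_to_vector`, `knn_posteriors`, and `rescore` are hypothetical.

```python
import numpy as np

def segment_to_vector(frames, left_ctx, right_ctx, n_regions=3):
    """Map a variable-length phoneme segment to a fixed-length vector.

    frames:    (T, D) per-frame features of the phoneme, T >= 1
    left_ctx:  (L, D) frames just before the segment (boundary context)
    right_ctx: (R, D) frames just after the segment (boundary context)
    Assumed layout: [left mean, region means..., right mean].
    """
    idx = np.array_split(np.arange(len(frames)), n_regions)
    # Average each region; fall back to the whole-segment mean if a region is empty.
    regions = [frames[i].mean(axis=0) if len(i) else frames.mean(axis=0) for i in idx]
    return np.concatenate([left_ctx.mean(axis=0), *regions, right_ctx.mean(axis=0)])

def knn_posteriors(query, train_x, train_y, n_classes, k=5):
    """Exact k-NN vote fractions (a stand-in for the approximate SASH search)."""
    dist = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(dist)[:k]
    votes = np.bincount(train_y[nearest], minlength=n_classes)
    return votes / votes.sum()

def rescore(hyp_labels, hyp_log_scores, posteriors, weight=0.5, floor=1e-3):
    """Combine recognizer log-scores with floored log k-NN vote fractions.

    The interpolation weight and floor are illustrative assumptions.
    """
    knn_log = np.log(np.maximum(posteriors[hyp_labels], floor))
    return (1.0 - weight) * hyp_log_scores + weight * knn_log

# Synthetic two-class demo: segments of different lengths around two centres.
rng = np.random.default_rng(0)
D = 4  # per-frame feature dimension (hypothetical)

def make_segment(center, T):
    frames = center + 0.1 * rng.standard_normal((T, D))
    ctx = center + 0.1 * rng.standard_normal((2, D))
    return segment_to_vector(frames, ctx, ctx)

train_x = np.stack([make_segment(c, T) for c in (0.0, 3.0) for T in (2, 5, 8, 12)])
train_y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

query = make_segment(3.0, 7)
post = knn_posteriors(query, train_x, train_y, n_classes=2, k=3)

# Two competing labels for one segment, with made-up recognizer log-scores.
hyp_labels = np.array([0, 1])
hyp_log_scores = np.array([-1.0, -1.2])
combined = rescore(hyp_labels, hyp_log_scores, post)
best_label = int(hyp_labels[np.argmax(combined)])
```

Because every segment maps to a vector of the same length regardless of its duration, any nearest-neighbour index can be swapped in for the brute-force search; that is the property the abstract relies on when it pairs fixed-length, high-dimensional vectors with SASH.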

Similar resources

Exploiting deep neural networks for detection-based speech recognition

In recent years, deep neural networks (DNNs), i.e., multilayer perceptrons (MLPs) with many hidden layers, have been successfully applied to several speech tasks, e.g., phoneme recognition, out-of-vocabulary word detection, and confidence measures. In this paper, we show that DNNs can be used to boost the classification accuracy of basic speech units, such as phonetic attributes (phonological featur...

A hybridization of evolutionary fuzzy systems and Ant Colony Optimization for intrusion detection

A hybrid approach for intrusion detection in computer networks is presented in this paper. The proposed approach combines an evolutionary-based fuzzy system with an Ant Colony Optimization procedure to generate high-quality fuzzy-classification rules. We applied our hybrid learning approach to network security and validated it using the DARPA KDD-Cup99 benchmark data set. The results indicate t...

A Rescoring Approach for Keyword Search Using Lattice Context Information

In this paper, we present a rescoring approach for keyword search (KWS) based on neural networks (NN). This approach exploits only the lattice context in a detected time interval instead of its corresponding audio. The most informative arcs in the lattice context are selected and represented as a matrix, where words on arcs are represented in an embedding space with respect to their pronunciations. ...

Comparison of Distance Metrics for Phoneme Classification based on Deep Neural Network Features and Weighted k-NN Classifier

K-nearest neighbor (k-NN) classification is a simple and powerful classification method. k-NN classifiers approximate a Bayesian classifier for a large number of data samples. The accuracy of a k-NN classifier depends on the distance metric used to find the nearest neighbors and on the features used to represent instances in the training and testing data. In this paper we use deep neural networks (DNNs) as a f...

Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain

This article presents a new feature extraction technique based on the temporal tracking of clusters in the spectro-temporal feature space. In the proposed method, auditory cortical outputs were clustered. The attributes of speech clusters were extracted as secondary features. However, the shape and position of speech clusters change over time. The clusters were temporally tracked and temporal tra...

Publication date: 2010